Abstract

In 1955 Levine introduced two linear equating procedures for the common-item
nonequivalent-populations design. His two procedures make the same
assumptions about true scores; they differ in terms of the nature of the
equating function employed.

In this paper two parameterizations of a classical congeneric model are
introduced to model the variables in the Levine procedures for the external
and internal anchor cases. The models differ in the constraints imposed on
certain effective test length parameters, as well as assumptions made about
one covariance term. This modeling leads to simple expressions for true-score
variances, reliabilities, and the so-called "Angoff error variances."

Applying these two parameterizations of the classical congeneric model
with the Levine assumptions leads to general equations (for both of the Levine
procedures and both the external and internal anchor cases) that involve
ratios of the effective test length parameters. This presentation facilitates
interpretation.

The role of synthetic population weights for both Levine procedures is
considered, along with an alternative interpretation of one of Levine's
procedures.

Congeneric Models and Levine's
Linear Equating Procedures

Levine (1955) introduced two linear equating methods for a design in
which two non-equivalent populations take different forms of a test with a
common set of equating items, or anchor. Levine referred to his two methods
as major-axis procedures. Angoff discussed these methods in his 1971 chapter
on scales, norms, and equivalent scores in the second edition of Educational
Measurement. Angoff's chapter was reprinted in 1984 by the Educational
Testing Service. Other authors who have treated one or both of these methods
include Woodruff (1986, 1989), Kolen and Brennan (1987), Petersen, Kolen, and
Hoover (1989), MacCann (1990), and Hanson (1990).

Levine's methods make assumptions about true scores and error scores.
Consequently, to apply these methods, it is necessary to model the
relationships among observed, true, and error scores. In this paper, a
particular version of a congeneric model is employed in which the error
variances are assumed to follow classical assumptions. Actually, two
parameterizations of the model are employed: one that is associated with the
common items constituting an external anchor, and the other for an internal
anchor in which the common items are part of the full-length forms.

For both of Levine's methods, this modeling leads to general equations
that involve ratios of certain effective test length parameters. These
parameters aid in presenting and interpreting results. It is also shown that
Angoff's (1984, pp. 114-115) results for Levine's methods can be obtained from
the results presented here.

The paper ends with a discussion of an alternative conception of one of
Levine's methods, followed by a consideration of other issues of
interpretation and possible future research.

Terminology

Terminology employed with Levine's procedures has become somewhat
confused or, at best, inconsistent in recent years. In particular, Levine
originally distinguished between his procedures in terms of their presumed
appropriateness for equally reliable and unequally reliable tests. However,
Woodruff (1986), Kolen and Brennan (1987), and Hanson (1990) have all noted
that Levine's results can be derived without making any assumptions about the
reliabilities of the tests involved. Rather, the distinguishing difference in
the derivations of the procedures is that the so-called "equally reliable"
method is an observed-score equating procedure, whereas the so-called
"unequally reliable" method uses observed scores in a linear relationship
between true scores for the two forms. Here, therefore, to avoid perpetuating
the impression that Levine's procedures make reliability assumptions, the
procedures will be referred to as Levine's observed-score and true-score
equating procedures. (Admittedly, the phrase "true-score equating" is a bit
inaccurate because, as noted above, it is actually observed scores that are
used in a true-score relationship, but this inconsistency seems slight
compared to the potential misunderstanding inherent in the phrases
"equally reliable" and "unequally reliable.")

Also, as noted by Woodruff (1989), a distinction can be drawn between the
results for Levine's procedures as expressed by Levine (1955), and a
particular case of Levine's results that Angoff (1984) provides. In terms of
formulas, Levine's "general" results and Angoff's version of Levine's results
can be distinguished by the fact that Levine (1955) typically expresses
true-score variance as observed-score variance times reliability, without
specifying a specific reliability coefficient. By contrast, Angoff's versions
of Levine's results are based on specific reliability coefficients that are
derived using "Angoff's error variances" (see Angoff, 1953). In this paper,
whenever there is the potential for confusion with respect to this
distinction, "Levine-Angoff" will be used to designate Angoff's (1984) results
for Levine's methods. Largely, this paper deals with alternative and somewhat
more general derivations and presentations of the Levine-Angoff results, a
presentation intended to aid interpretation.

Some Results for the Classical Congeneric Model

Let X and V designate observed scores for two tests or sets of items.
For the congeneric model, X and V are decomposed as follows:

\[ X = T_x + E_x = (\lambda_x T + \delta_x) + E_x \quad \text{and} \tag{1} \]
\[ V = T_v + E_v = (\lambda_v T + \delta_v) + E_v . \tag{2} \]

A particular version of the congeneric model arises when it is further
specified that

\[ \sigma^2(E_x) = \lambda_x \sigma^2(E) \quad \text{and} \tag{3} \]
\[ \sigma^2(E_v) = \lambda_v \sigma^2(E) . \tag{4} \]

This special version will be called here the classical congeneric model. It
is discussed by Feldt (1975), Feldt and Brennan (1989, pp. 111-112) and
Woodruff (1986), among others. The word "classical" is used here to indicate
that the error variances are proportional to the "effective" test
lengths, $\lambda_x$ and $\lambda_v$. In this sense, this model is closer to the
traditional classical test theory model than would be the case if Equations 3
and 4 did not hold.

Discussed next are two parameterizations of this classical congeneric
model. These parameterizations differ with respect to constraints imposed on
the $\lambda$'s and assumptions about $\sigma(E_x, E_v)$. The first case is for
tests X and V disjoint and will be applied later to external anchor equating;
the second case is for V included in X and will be applied later to internal
anchor equating.

Tests X and V Disjoint 

Suppose that tests X and V are disjoint in the sense that they contain no
common items. To represent this case, we assume that errors have an
expectation of zero, all covariances between true and error scores are zero
and, since X and V are distinct,

\[ \sigma(E_x, E_v) = 0 . \tag{5} \]

For identifiability purposes we impose the usual constraint

\[ \lambda_x + \lambda_v = 1 . \tag{6} \]

(It is also usual to impose the constraint $\delta_x + \delta_v = 0$, but doing
so is not required for the following derivations.)

For this model, the variances and covariances are easily determined:

\[ \sigma^2(X) = \lambda_x^2 \sigma^2(T) + \lambda_x \sigma^2(E) , \tag{7} \]
\[ \sigma^2(V) = \lambda_v^2 \sigma^2(T) + \lambda_v \sigma^2(E) , \quad \text{and} \tag{8} \]
\[ \sigma(X,V) = \lambda_x \lambda_v \sigma^2(T) . \tag{9} \]

Further, letting A = X + V (recall that X and V consist of non-overlapping
sets of items) it is easy to show that

\[ \sigma^2(A) = \sigma^2(T) + \sigma^2(E) . \]

To derive $\lambda_x$ in terms of variances and covariances, note that

\begin{align*}
\sigma^2(X) + \sigma(X,V) &= \lambda_x^2 \sigma^2(T) + \lambda_x \sigma^2(E) + \lambda_x \lambda_v \sigma^2(T) \\
&= \lambda_x [(\lambda_x + \lambda_v) \sigma^2(T) + \sigma^2(E)] \\
&= \lambda_x [\sigma^2(T) + \sigma^2(E)] \\
&= \lambda_x \sigma^2(A) .
\end{align*}

It follows that the effective test length for X is

\begin{align*}
\lambda_x &= [\sigma^2(X) + \sigma(X,V)] / \sigma^2(A) \tag{10} \\
&= \sigma(X, X + V) / \sigma^2(A) \\
&= \sigma(X, A) / \sigma^2(A) \\
&= \alpha(X|A) , \tag{11}
\end{align*}

where $\alpha(X|A)$ is the slope of the linear regression of X on A. Similarly,
the effective test length for V is

\begin{align*}
\lambda_v &= [\sigma^2(V) + \sigma(X,V)] / \sigma^2(A) \tag{12} \\
&= \alpha(V|A) . \tag{13}
\end{align*}

Note that $[\sigma^2(X) + \sigma(X,V)]/\sigma^2(A)$ and
$[\sigma^2(V) + \sigma(X,V)]/\sigma^2(A)$ are both (relative) effective weights,
as defined by Wang and Stanley (1970). Hence, the effective test lengths
$\lambda_x$ and $\lambda_v$ are also interpretable as (relative) effective
weights, as well as slopes of X (or V) on A = X + V.

For this classical congeneric model, using Equation 9 and noting
that $\sigma^2(T_v) = \lambda_v^2 \sigma^2(T)$, we obtain

\[ \sigma^2(T_v) = (\lambda_v / \lambda_x) \, \sigma(X,V) . \tag{14} \]

Consequently, the reliability of test V is

\begin{align*}
\rho(V,V') &= \sigma^2(T_v) / \sigma^2(V) \\
&= (\lambda_v / \lambda_x) \, \sigma(X,V) / \sigma^2(V) . \tag{15}
\end{align*}

Also, using Equation 14 (and then Equations 10 and 12) the variance of the
errors associated with test V is found to be

\begin{align*}
\sigma^2(E_v) &= \sigma^2(V) - (\lambda_v / \lambda_x) \, \sigma(X,V) \tag{16} \\
&= \frac{\sigma^2(X) \sigma^2(V) - [\sigma(X,V)]^2}{\sigma^2(X) + \sigma(X,V)} . \tag{17}
\end{align*}

For test X, equations for true-score variance, reliability, and error
variance can be obtained by interchanging X and V in Equations 14-17, resulting
in

\[ \sigma^2(T_x) = (\lambda_x / \lambda_v) \, \sigma(X,V) , \tag{18} \]
\[ \rho(X,X') = (\lambda_x / \lambda_v) \, \sigma(X,V) / \sigma^2(X) , \quad \text{and} \tag{19} \]
\begin{align*}
\sigma^2(E_x) &= \sigma^2(X) - (\lambda_x / \lambda_v) \, \sigma(X,V) \tag{20} \\
&= \frac{\sigma^2(X) \sigma^2(V) - [\sigma(X,V)]^2}{\sigma^2(V) + \sigma(X,V)} . \tag{21}
\end{align*}

Equations 17 and 21 are the usual expressions for the so-called "Angoff
error variances" for tests V and X, respectively, for X and V disjoint (see
Angoff, 1953; Petersen et al., 1989, p. 254).
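
The disjoint-case results above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `external_anchor_stats` and the input moments are hypothetical, chosen so that they are exactly the moments implied by Equations 7-9 with effective test lengths 0.75 and 0.25, a true-score variance of 16, and an error variance of 4.

```python
def external_anchor_stats(var_x, var_v, cov_xv):
    """Classical congeneric quantities for disjoint X and V (Equations 10-21).

    Inputs are the observed-score variances of X and V and their covariance.
    """
    var_a = var_x + var_v + 2.0 * cov_xv      # variance of A = X + V
    lam_x = (var_x + cov_xv) / var_a          # Equation 10
    lam_v = (var_v + cov_xv) / var_a          # Equation 12
    true_var_v = (lam_v / lam_x) * cov_xv     # Equation 14
    true_var_x = (lam_x / lam_v) * cov_xv     # Equation 18
    return {
        "lam_x": lam_x,
        "lam_v": lam_v,
        "rel_x": true_var_x / var_x,          # Equation 19
        "rel_v": true_var_v / var_v,          # Equation 15
        "err_var_x": var_x - true_var_x,      # Equation 20
        "err_var_v": var_v - true_var_v,      # Equation 16
    }


# Hypothetical moments: lam_x = 0.75, lam_v = 0.25, var(T) = 16, var(E) = 4
# give var(X) = 12, var(V) = 2, cov(X, V) = 3 by Equations 7-9.
stats = external_anchor_stats(var_x=12.0, var_v=2.0, cov_xv=3.0)
print(stats)
```

Under these assumed moments the recovered error variances agree with the closed forms in Equations 17 and 21, which is the sense in which the model reproduces the Angoff error variances.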
Test V Included in Test X 

The previous section considered the case of tests X and V containing no
common items. In equating terminology this is the case associated with V
being an external anchor. Suppose now that test V is an internal anchor,
i.e., all of the items in test V are included in test X. In this case, the
classical congeneric model Equations 1-4 still apply, and we assume that
errors have an expectation of zero and all covariances between true and error
scores are zero. However, we replace the constraint in Equation 6 with

\[ \lambda_x = 1 . \tag{22} \]

(For completeness it is typical to specify the constraint $\delta_x = 0$, but
doing so is not necessary for deriving the results that follow.) Setting
$\lambda_x = 1$ merely specifies that, when V is included in X, the full-length
test, X, has an effective length of 1. Consequently, for this model we let
$T_x = T$ and $E_x = E$, and Equation 1 can be written

\[ X = T_x + E_x = T + E . \]

Equation 5, $\sigma(E_x, E_v) = 0$, is not valid for this case, however. Rather,
since V is included in X, only the error covariance between V and the
non-common part of X is zero. Therefore,

\[ \sigma(E_x, E_v) = \sigma(E, E_v) = \sigma(E_v, E_v) = \sigma^2(E_v) = \lambda_v \sigma^2(E) . \tag{23} \]

For this internal anchor classical congeneric model, the variances and
covariances are easily found to be

\[ \sigma^2(X) = \sigma^2(T) + \sigma^2(E) , \tag{24} \]
\[ \sigma^2(V) = \lambda_v^2 \sigma^2(T) + \lambda_v \sigma^2(E) , \quad \text{and} \tag{25} \]
\begin{align*}
\sigma(X,V) &= \lambda_v \sigma^2(T) + \lambda_v \sigma^2(E) \tag{26} \\
&= \lambda_v \sigma^2(X) . \tag{27}
\end{align*}

From Equation 27, it is clear that

\[ \lambda_v = \sigma(X,V) / \sigma^2(X) = \alpha(V|X) . \tag{28} \]

Again we find that the effective test length parameter $\lambda_v$ is a slope.

Recalling that $\sigma^2(T_v) = \lambda_v^2 \sigma^2(T)$, and solving Equations 25
and 26 simultaneously, we obtain

\[ \sigma^2(T_v) = \frac{\lambda_v}{1 - \lambda_v} [\sigma(X,V) - \sigma^2(V)] . \tag{29} \]

Consequently, the reliability of test V is

\[ \rho(V,V') = \frac{\lambda_v [\sigma(X,V) - \sigma^2(V)]}{(1 - \lambda_v) \, \sigma^2(V)} . \tag{30} \]

Also, using Equation 29 (and then Equation 28),

\begin{align*}
\sigma^2(E_v) &= \frac{\sigma^2(V) - \lambda_v \sigma(X,V)}{1 - \lambda_v} \tag{31} \\
&= \frac{\sigma^2(X) \sigma^2(V) - [\sigma(X,V)]^2}{\sigma^2(X) - \sigma(X,V)} . \tag{32}
\end{align*}

Since $\sigma^2(T_v) = \lambda_v^2 \sigma^2(T) = \lambda_v^2 \sigma^2(T_x)$, it
follows from Equation 29 that

\[ \sigma^2(T_x) = \frac{\sigma(X,V) - \sigma^2(V)}{\lambda_v (1 - \lambda_v)} , \tag{33} \]

and the reliability of test X is

\[ \rho(X,X') = \frac{\sigma(X,V) - \sigma^2(V)}{\lambda_v (1 - \lambda_v) \, \sigma^2(X)} . \tag{34} \]

Since $\sigma^2(E_v) = \lambda_v \sigma^2(E) = \lambda_v \sigma^2(E_x)$ and
$\lambda_v = \sigma(X,V)/\sigma^2(X)$, it follows from Equations 31 and 32,
respectively, that

\begin{align*}
\sigma^2(E_x) &= \frac{\sigma^2(V) - \lambda_v \sigma(X,V)}{\lambda_v (1 - \lambda_v)} \tag{35} \\
&= \frac{\sigma^2(X) \{\sigma^2(X) \sigma^2(V) - [\sigma(X,V)]^2\}}{\sigma(X,V) [\sigma^2(X) - \sigma(X,V)]} . \tag{36}
\end{align*}

Equations 32 and 36 are the usual expressions for the Angoff error
variances for V and X, respectively, for the case of V included in X (see
Angoff, 1953; Petersen et al., 1989, p. 254).
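
The internal-anchor expressions can be checked in the same way. This is a minimal sketch under stated assumptions: the function name `internal_anchor_stats` and the input moments are hypothetical, with the moments chosen to be those implied by Equations 24-27 for an anchor of effective length 0.25, a true-score variance of 16, and an error variance of 4.

```python
def internal_anchor_stats(var_x, var_v, cov_xv):
    """Classical congeneric quantities when V is included in X (Equations 28-36)."""
    lam_v = cov_xv / var_x                                    # Equation 28
    true_var_v = lam_v / (1.0 - lam_v) * (cov_xv - var_v)     # Equation 29
    true_var_x = (cov_xv - var_v) / (lam_v * (1.0 - lam_v))   # Equation 33
    err_var_v = (var_v - lam_v * cov_xv) / (1.0 - lam_v)      # Equation 31
    err_var_x = err_var_v / lam_v                             # Equation 35
    return {
        "lam_v": lam_v,
        "rel_x": true_var_x / var_x,                          # Equation 34
        "rel_v": true_var_v / var_v,                          # Equation 30
        "err_var_x": err_var_x,
        "err_var_v": err_var_v,
    }


# Hypothetical moments: lam_v = 0.25, var(T) = 16, var(E) = 4
# give var(X) = 20, var(V) = 2, cov(X, V) = 5 by Equations 24-27.
stats = internal_anchor_stats(var_x=20.0, var_v=2.0, cov_xv=5.0)
print(stats)
```

With these inputs the sketch recovers the error variances 4 and 1 for X and V, matching the closed forms in Equations 36 and 32.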

Comments 

Many of the results presented thus far have been provided implicitly or
explicitly by others (e.g., Angoff, 1953; Feldt, 1975; and Woodruff, 1986).
However, the particular form of some of the derivations presented here is
somewhat novel and compact.

Also, strictly speaking, not all of the results that have been presented
are required to derive the Levine-Angoff results considered subsequently. In
particular, the reliabilities and Angoff error variances are not required per
se, but they are useful in relating expressions of results to be presented
with corresponding expressions provided by Angoff (1984), Kolen and Brennan
(1987), and Petersen et al. (1989), among others.

Levine Observed-Score Method 

The Levine observed-score method (elsewhere called the "equally reliable"
method) for the common-item nonequivalent-populations design was originally
developed by Levine (1955). Angoff (1984, p. 115) and Petersen et al. (1989,
p. 254) also present descriptions of the method. Using a congeneric model,
Woodruff (1986) derived a special case of the Levine-Angoff results.
Subsequently, Kolen and Brennan (1987) derived a more general version of the
Levine-Angoff results using a framework that explicitly incorporates the
synthetic group concept originally introduced by Braun and Holland (1982).
The derivation outlined below integrates the Kolen and Brennan (1987)
presentation and the classical congeneric model results presented previously.

Assume that a new test form X is administered to population 1 and an old
test form Y is administered to population 2. (The adjectives "new" and "old"
to describe forms X and Y, respectively, are used here for convenience only.
There is nothing in the derivations that distinguishes between the "newness"
or "oldness" of a form.) Also, assume that both populations take a common set
of items, V, which may be distinct from X and Y or included in both X and Y.
This is a description of the common-item nonequivalent-populations design.

For this design, the two populations can be combined into a single
population for defining the equating relationship. To address this issue
Braun and Holland (1982) introduced the concept of a synthetic population.
Statistics for populations 1 and 2 are proportionally weighted by $w_1$ and
$w_2$, respectively (i.e., $w_1 + w_2 = 1$ with $w_1, w_2 \geq 0$), to obtain
statistics for the synthetic population.

For the Levine observed-score method, the linear equation for equating
scores on X to the scale of Y is

\[ \ell(X) = \frac{\sigma_s(Y)}{\sigma_s(X)} [X - \mu_s(X)] + \mu_s(Y) , \tag{37} \]

where s indicates the synthetic population. For examinees in the synthetic
population, the transformed observed scores on X [i.e., $\ell(X)$] have the same
mean and standard deviation as the observed scores on Y.
Assumptions 

Letting $T_x$, $T_y$, and $T_v$ be true scores for X, Y, and V, respectively,
Levine made the following three assumptions in deriving his results (see Kolen
& Brennan, 1987, pp. 266-267):

(a) $T_x$ and $T_v$ correlate perfectly for both populations, and the same
condition holds for $T_y$ and $T_v$;

(b) the linear function of $T_x$ on $T_v$ is the same for both populations,
and the same condition holds for $T_y$ and $T_v$; and

(c) measurement error variance for X is the same for both populations,
and the same condition holds for Y and V.

General Results 

Letting subscripts designate populations, Kolen and Brennan (1987, see
especially pp. 267, 268, and 272) show that under the Levine assumptions the
four parameters in Equation 37 can be represented as:

\[ \mu_s(X) = \mu_1(X) - w_2 \gamma_1 [\mu_1(V) - \mu_2(V)] , \tag{38} \]
\[ \mu_s(Y) = \mu_2(Y) + w_1 \gamma_2 [\mu_1(V) - \mu_2(V)] , \tag{39} \]
\[ \sigma_s^2(X) = \sigma_1^2(X) - w_2 \gamma_1^2 [\sigma_1^2(V) - \sigma_2^2(V)] + w_1 w_2 \gamma_1^2 [\mu_1(V) - \mu_2(V)]^2 , \quad \text{and} \tag{40} \]
\[ \sigma_s^2(Y) = \sigma_2^2(Y) + w_1 \gamma_2^2 [\sigma_1^2(V) - \sigma_2^2(V)] + w_1 w_2 \gamma_2^2 [\mu_1(V) - \mu_2(V)]^2 ; \tag{41} \]

where the $\gamma$-terms are ratios of true-score standard deviations. In
particular,

\[ \gamma_1 = \sigma_1(T_x) / \sigma_1(T_v) \quad \text{and} \tag{42} \]
\[ \gamma_2 = \sigma_2(T_y) / \sigma_2(T_v) . \tag{43} \]

(Angoff, 1987, and Brennan & Kolen, 1987, discuss and debate various issues
with respect to choosing the weights $w_1$ and $w_2$.)

When the classical congeneric model is applied to obtain $\gamma_1$ and
$\gamma_2$, the results discussed next are obtained for the external and
internal anchor cases.

External Anchor

Substituting Equations 14 and 18 into Equation 42 we obtain

\begin{align*}
\gamma_1 &= \sqrt{(\lambda_{x_1} / \lambda_{v_1}) / (\lambda_{v_1} / \lambda_{x_1})} \\
&= \lambda_{x_1} / \lambda_{v_1} , \tag{44}
\end{align*}

where the subscript 1 is used to specify that the data are for examinees in
population 1. In terms of variances and covariances, the effective test
length parameters in Equation 44 are given by Equations 10 and 12. Therefore,

\[ \gamma_1 = [\sigma_1^2(X) + \sigma_1(X,V)] / [\sigma_1^2(V) + \sigma_1(X,V)] . \tag{45} \]

Furthermore, the effective test length parameters in Equation 44 are also
given by the slopes in Equations 11 and 13. Therefore,

\[ \gamma_1 = \alpha_1(X|A) / \alpha_1(V|A) , \tag{46} \]

where A = X + V.

Corresponding equations for the old test form Y and population 2 can be
obtained by substituting Y for X, 2 for 1, and B = Y + V for A = X + V, in
Equations 44-46, resulting in

\begin{align*}
\gamma_2 &= \lambda_{y_2} / \lambda_{v_2} \tag{47} \\
&= [\sigma_2^2(Y) + \sigma_2(Y,V)] / [\sigma_2^2(V) + \sigma_2(Y,V)] \tag{48} \\
&= \alpha_2(Y|B) / \alpha_2(V|B) . \tag{49}
\end{align*}

Equations 44 and 47 state that, for the Levine observed-score method with
an external anchor, $\gamma_1$ and $\gamma_2$ (i.e., the ratios of the true-score
standard deviations in Equations 42 and 43) are ratios of effective test
lengths in populations 1 and 2, respectively.

Equations 45 and 48 are the most frequently reported expressions for
the $\gamma$-terms (see, for example, Angoff, 1984, p. 115 and Kolen & Brennan,
1987, p. 272), but to this author these expressions lack the interpretability
of the effective test length ratios in Equations 44 and 47, and to some extent
they lack the interpretability of the slope ratios in Equations 46 and 49.
Internal Anchor 

Substituting Equations 29 and 33 into Equation 42, we obtain

\begin{align*}
\gamma_1 &= \sqrt{\frac{1}{\lambda_{v_1} (1 - \lambda_{v_1})} \Big/ \frac{\lambda_{v_1}}{1 - \lambda_{v_1}}} \\
&= 1 / \lambda_{v_1} . \tag{50}
\end{align*}

This, too, is a ratio of effective test lengths because, for the internal
anchor case, the effective test length of X in population 1 is
$\lambda_{x_1} = 1$ (see Equation 22), which is the numerator of Equation 50.
Using Equation 28, an alternative expression for $\gamma_1$ is

\[ \gamma_1 = 1 / \alpha_1(V|X) , \tag{51} \]

which is the expression provided by Angoff (1984, p. 115) and Kolen and
Brennan (1987, p. 272). For the old form Y and population 2,

\begin{align*}
\gamma_2 &= 1 / \lambda_{v_2} \tag{52} \\
&= 1 / \alpha_2(V|Y) . \tag{53}
\end{align*}
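
The observed-score method can be sketched in a few lines once the gamma-terms are available. The code below is an illustrative sketch only: the function names, the dictionary keys, and all numerical moments are hypothetical. It computes the gamma-terms from Equations 45/48 (external anchor) or 51/53 (internal anchor), forms the synthetic-population moments of Equations 38-41, and applies Equation 37.

```python
import math


def gamma_external(var_f, var_v, cov_fv):
    """Equations 45 and 48: ratio of effective test lengths, external anchor."""
    return (var_f + cov_fv) / (var_v + cov_fv)


def gamma_internal(var_f, cov_fv):
    """Equations 51 and 53: reciprocal of the slope of V on the full form."""
    return var_f / cov_fv


def levine_observed(x, pop1, pop2, gamma1, gamma2, w1):
    """Equation 37, using the synthetic moments of Equations 38-41.

    pop1: mean_x, var_x, mean_v, var_v for population 1 (form X).
    pop2: mean_y, var_y, mean_v, var_v for population 2 (form Y).
    """
    w2 = 1.0 - w1
    d_mean = pop1["mean_v"] - pop2["mean_v"]
    d_var = pop1["var_v"] - pop2["var_v"]
    mean_x = pop1["mean_x"] - w2 * gamma1 * d_mean                      # Eq. 38
    mean_y = pop2["mean_y"] + w1 * gamma2 * d_mean                      # Eq. 39
    var_x = (pop1["var_x"] - w2 * gamma1 ** 2 * d_var
             + w1 * w2 * gamma1 ** 2 * d_mean ** 2)                     # Eq. 40
    var_y = (pop2["var_y"] + w1 * gamma2 ** 2 * d_var
             + w1 * w2 * gamma2 ** 2 * d_mean ** 2)                     # Eq. 41
    return math.sqrt(var_y / var_x) * (x - mean_x) + mean_y             # Eq. 37


# Hypothetical internal-anchor data: cov(X, V) = cov(Y, V) = 5.
pop1 = {"mean_x": 30.0, "var_x": 20.0, "mean_v": 11.0, "var_v": 2.0}
pop2 = {"mean_y": 32.0, "var_y": 20.0, "mean_v": 10.0, "var_v": 2.0}
g1 = gamma_internal(pop1["var_x"], 5.0)
g2 = gamma_internal(pop2["var_y"], 5.0)
print(levine_observed(30.0, pop1, pop2, g1, g2, w1=0.5))
```

Because Equations 38-41 depend on `w1`, the equated score from this method generally changes when the synthetic population weights change.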



Comment

The derivation that has been outlined here of Levine's observed-score
method integrates the Kolen and Brennan presentation of this method with the
classical congeneric model results presented previously, with emphasis placed
upon the interpretation of the $\gamma$-terms as ratios of effective test
lengths. Certain aspects of the approach, results, and interpretations
presented here are also provided by Angoff (1984, pp. 114-115), Kolen and
Brennan (1987), and Woodruff (1986). For example, Angoff's (1984) results are
equivalent to those presented here when $w_1 = n_1/(n_1 + n_2)$ and
$w_2 = n_2/(n_1 + n_2)$, where $n_1$ and $n_2$ are sample sizes for
populations 1 and 2, respectively. Woodruff's (1986) results are equivalent to
those presented here when $w_1 = 1$.

Levine True-Score Method

Levine (1955) also developed another method for the common-item
nonequivalent-populations design. This second method is called Levine's
"true-score" method here. (Elsewhere, it is called Levine's "unequally
reliable" method.) Angoff (1984, p. 115) and Petersen et al. (1989, p. 254)
present descriptions of the method. The assumptions about true scores for
this method are the same as those for the observed-score method. What
distinguishes the methods is that the linear equation for the true-score
method is expressed in terms of certain true-score quantities, rather than the
observed-score quantities in Equation 37.

Specifically, for the true-score method, the basic linear equation is

\[ g(T_x) = \frac{\sigma_s(T_y)}{\sigma_s(T_x)} [T_x - \mu_s(T_x)] + \mu_s(T_y) , \tag{54} \]

where $T_x$ designates the true score associated with a particular examinee's
observed score. For examinees in the synthetic population, the transformed
true scores on X [i.e., $g(T_x)$] have the same mean and standard deviation as
the true scores on Y.

Clearly, however, examinees' true scores are never known. Therefore, the
linear equation that is used in practice is

\[ g(X) = \frac{\sigma_s(T_y)}{\sigma_s(T_x)} [X - \mu_s(X)] + \mu_s(Y) , \tag{55} \]

since true-score means equal observed-score means for the models considered
here. Equation 54 or 55 is the Levine true-score equating function. The
logic of using g(X) rather than $g(T_x)$ is neither more nor less compelling
than, for example, using observed scores in IRT true-score equating procedures
(see Lord, 1980, p. 202). Note, in particular, that the transformed observed
scores on X [i.e., g(X)] typically do not have the same standard deviation as
the true scores on Y or the observed scores on Y.
General Results 

Using the Kolen and Brennan (1987) approach, it can be shown that under
Levine's assumptions:

\[ \mu_s(T_x) = \mu_s(X) = \mu_1(X) - w_2 \gamma_1 [\mu_1(V) - \mu_2(V)] , \tag{56} \]
\[ \mu_s(T_y) = \mu_s(Y) = \mu_2(Y) + w_1 \gamma_2 [\mu_1(V) - \mu_2(V)] , \tag{57} \]
\[ \sigma_s^2(T_x) = \gamma_1^2 \sigma_s^2(T_v) , \quad \text{and} \tag{58} \]
\[ \sigma_s^2(T_y) = \gamma_2^2 \sigma_s^2(T_v) , \tag{59} \]

where $\sigma_s^2(T_v) = w_1 \sigma_1^2(T_v) + w_2 \sigma_2^2(T_v) + w_1 w_2 [\mu_1(V) - \mu_2(V)]^2$.

Equations 56 and 57 are the same as the corresponding Equations 38 and 39,
respectively, for the observed-score method. Equation 58 for the true-score
variance of X in the synthetic population is derived in the Appendix, and
Equation 59 can be derived in a similar manner.

Since the assumptions for both the observed-score and true-score methods
are the same, in general the $\gamma$-terms in Equations 56-59 are the ratios of
the true-score standard deviations in Equations 42 and 43. Furthermore, the
$\gamma$-terms are the same as those derived previously using the classical
congeneric models for the external and internal anchor cases. Thus,
the $\gamma$'s in Equations 56-59 are also interpretable as ratios of effective
test lengths.

The simple form of Equations 58 and 59 leads to the slope of the linear
equation for g(X) in Equation 55 being

\[ \sigma_s(T_y) / \sigma_s(T_x) = \gamma_2 / \gamma_1 , \tag{60} \]

which is $(\lambda_{y_2}/\lambda_{v_2}) / (\lambda_{x_1}/\lambda_{v_1})$ for the
external anchor case, and $\lambda_{v_1}/\lambda_{v_2}$ for the internal anchor
case. As shown below, the intercept for g(X) can also be expressed relatively
simply in terms of directly estimable parameters. Using Equations 56 and 57
with $\nu = \mu_1(V) - \mu_2(V)$, the intercept is

\begin{align*}
\mu_s(Y) - [\sigma_s(T_y)/\sigma_s(T_x)] \mu_s(X)
&= \mu_2(Y) + w_1 \gamma_2 \nu - (\gamma_2/\gamma_1) [\mu_1(X) - w_2 \gamma_1 \nu] \\
&= \mu_2(Y) - (\gamma_2/\gamma_1) \mu_1(X) + \gamma_2 (w_1 + w_2) \nu .
\end{align*}

Since $w_1 + w_2 = 1$, it follows that the intercept equals

\[ [\mu_2(Y) - (\gamma_2/\gamma_1) \mu_1(X)] + \gamma_2 [\mu_1(V) - \mu_2(V)] . \tag{61} \]

Note that the slope and intercept do not depend on the weights, $w_1$ and $w_2$.

Replacing Equations 60 and 61 in Equation 55 we obtain

\[ g(X) = (\gamma_2/\gamma_1) [X - \mu_1(X)] + \mu_2(Y) + \gamma_2 [\mu_1(V) - \mu_2(V)] . \tag{62} \]

Hence, g(X) for Levine's true-score method is invariant with respect to
weighting of populations 1 and 2 in forming the synthetic population, or we
might say that the concept of a synthetic population is not necessary to
conceptualize this method's results. Even so, it is sometimes useful to
display Levine's true-score method in the form of Equations 55-59 to compare
it with Levine's observed-score method in the form of Equations 37-41.

The usual presentation of results for the Levine true-score method is
rather different from that presented here. Therefore, provided below are the
"usual" Levine-Angoff results presented by Angoff (1984, p. 115), along with
proofs of their equivalence to the results presented here (which assume, of
course, the classical congeneric models discussed in this paper).
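
The weight invariance of the true-score method can be illustrated numerically. The sketch below is an assumption-laden illustration, not the author's code: the function names and all input values are hypothetical. One function applies Equation 62 directly; the other goes through the synthetic-population means of Equations 56 and 57 and the slope of Equation 60, as in Equation 55.

```python
def levine_true_score(x, mean1_x, mean2_y, mean1_v, mean2_v, gamma1, gamma2):
    """Equation 62: no synthetic population weights appear."""
    slope = gamma2 / gamma1
    return slope * (x - mean1_x) + mean2_y + gamma2 * (mean1_v - mean2_v)


def levine_true_score_synthetic(x, mean1_x, mean2_y, mean1_v, mean2_v,
                                gamma1, gamma2, w1):
    """Equation 55 evaluated with the weighted means of Equations 56 and 57."""
    w2 = 1.0 - w1
    nu = mean1_v - mean2_v
    mean_s_x = mean1_x - w2 * gamma1 * nu    # Equation 56
    mean_s_y = mean2_y + w1 * gamma2 * nu    # Equation 57
    slope = gamma2 / gamma1                  # Equation 60
    return slope * (x - mean_s_x) + mean_s_y


# Hypothetical values for the means and gamma-terms.
args = dict(mean1_x=30.0, mean2_y=32.0, mean1_v=11.0, mean2_v=10.0,
            gamma1=4.0, gamma2=3.0)
direct = levine_true_score(30.0, **args)
for w1 in (0.0, 0.3, 1.0):
    # The weighted route gives the same equated score for every w1.
    print(w1, levine_true_score_synthetic(30.0, w1=w1, **args), direct)
```

Every choice of `w1` yields the same equated score as the direct formula, which is the invariance stated above.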
External Anchor

Angoff (1984, p. 115) states that, when V is an external anchor, the
slope of g(X) is

\[ \frac{\sigma_s(T_y)}{\sigma_s(T_x)} = \frac{\alpha_2(Y|V)}{\alpha_1(X|V)} \cdot \frac{\rho_1(V,V')}{\rho_2(V,V')} . \tag{63} \]

Using Equation 15 for $\rho_1(V,V')$ and the parallel equation for the
reliability of V in population 2,

\[ \frac{\sigma_s(T_y)}{\sigma_s(T_x)} = \frac{\alpha_2(Y|V) \, (\lambda_{v_1}/\lambda_{x_1}) \, \sigma_1(X,V)/\sigma_1^2(V)}{\alpha_1(X|V) \, (\lambda_{v_2}/\lambda_{y_2}) \, \sigma_2(Y,V)/\sigma_2^2(V)} . \]

Since $\alpha_1(X|V) = \sigma_1(X,V)/\sigma_1^2(V)$ and
$\alpha_2(Y|V) = \sigma_2(Y,V)/\sigma_2^2(V)$, it follows that

\[ \frac{\sigma_s(T_y)}{\sigma_s(T_x)} = \frac{\lambda_{y_2}/\lambda_{v_2}}{\lambda_{x_1}/\lambda_{v_1}} . \]

Finally, from Equations 44 and 47, we obtain the slope given by Equation 60.

Angoff (1984, p. 115) also states that the intercept of g(X) with V being
an external anchor is

\[ \mu_2(Y) - \frac{\sigma_s(T_y)}{\sigma_s(T_x)} \mu_1(X) + \frac{\alpha_2(Y|V)}{\rho_2(V,V')} [\mu_1(V) - \mu_2(V)] . \tag{64} \]

Since $\alpha_2(Y|V) = \sigma_2(Y,V)/\sigma_2^2(V)$ and, by the parallel of
Equation 15 for population 2,
$\rho_2(V,V') = (\lambda_{v_2}/\lambda_{y_2}) \, \sigma_2(Y,V)/\sigma_2^2(V)$, it
follows that
$\alpha_2(Y|V)/\rho_2(V,V') = \lambda_{y_2}/\lambda_{v_2} = \gamma_2$ by
Equation 47. Therefore, since $\sigma_s(T_y)/\sigma_s(T_x) = \gamma_2/\gamma_1$,
the intercept given by Equation 64 can be written as

\[ \mu_2(Y) - (\gamma_2/\gamma_1) \mu_1(X) + \gamma_2 [\mu_1(V) - \mu_2(V)] , \]

which is identical to the result in Equation 62.
Internal Anchor 

For an internal anchor, Angoff (1984, p. 115) states that the slope of
g(X) is

\[ \sigma_s(T_y) / \sigma_s(T_x) = \alpha_1(V|X) / \alpha_2(V|Y) . \tag{65} \]

For the development considered here, by Equations 58 and 59, 50 and 52, and 51
and 53, respectively, the slope is

\[ \frac{\sigma_s(T_y)}{\sigma_s(T_x)} = \frac{\gamma_2}{\gamma_1} = \frac{\lambda_{v_1}}{\lambda_{v_2}} = \frac{\alpha_1(V|X)}{\alpha_2(V|Y)} , \]

which equals Equation 65.

Angoff (1984, p. 115) also states that the intercept of g(X) for an
internal anchor is

\[ \mu_2(Y) - \frac{\sigma_s(T_y)}{\sigma_s(T_x)} \mu_1(X) + \frac{1}{\alpha_2(V|Y)} [\mu_1(V) - \mu_2(V)] . \tag{66} \]

Using Equations 58, 59, and 53, we can rewrite Equation 66 as

\[ \mu_2(Y) - (\gamma_2/\gamma_1) \mu_1(X) + \gamma_2 [\mu_1(V) - \mu_2(V)] , \]

which is identical to the result in Equation 62.
First-Order Equity 

For the Levine true-score method, a function relating true scores is
applied to observed scores. As noted previously, the logic of doing so is
somewhat less than compelling, and it is not clear how the converted scores on
X [i.e., g(X)] are comparable to scores on Y. Hanson (1990), however, has
shown that Levine's true-score equating function (Equation 54) for the
common-item nonequivalent-populations design results in first-order equity of
the equated test scores under a particular parameterization of the classical
congeneric model.

Before describing Hanson's modeling in more detail, we illustrate
Hanson's approach for the much simpler case of the single group design and the
Levine true-score equating function

\[ g(X) = [\sigma(T_y)/\sigma(T_x)] [X - \mu(T_x)] + \mu(T_y) . \tag{67} \]

(With the single group design, no synthetic population is involved.
Therefore, there are no subscripts on the parameters in Equation 67.)

Letting $\psi$ be a function that relates true scores on X to true scores on
Y, first-order, or weak, equity is defined as

\[ E[g(X) \mid \psi(T_x) = \tau] = E(Y \mid T_y = \tau) \quad \text{for all } \tau . \tag{68} \]

Under this definition, the transformed score g(X) is defined to be equivalent
to Y if the expected value of the conditional distribution of g(X) given
$\psi(T_x) = \tau$ equals the expected value of the conditional distribution of
Y given $T_y = \tau$. Divgi (1981), Morris (1982), and Yen (1983) consider
first-order equity, which is a weaker case of the concept of equity first
proposed by Lord (1980).

Consider the single group design with no common items, and assume that no
context effects exist relative to the fact that examinees take both forms.
For this design the congeneric model for test forms X and Y can be specified
as

\[ X = T_x + E_x = (\lambda_x T + \delta_x) + E_x \quad \text{and} \tag{69} \]
\[ Y = T_y + E_y = (\lambda_y T + \delta_y) + E_y . \tag{70} \]

It follows that

\[ T_y = (\lambda_y/\lambda_x)(T_x - \delta_x) + \delta_y = \psi(T_x) . \]

Consequently,

\[ g(X) = (\lambda_y/\lambda_x)(X - \delta_x) + \delta_y \tag{71} \]

satisfies the condition of first-order equity in Equation 68, because the
expected value of g(X) in Equation 71, given $\tau$, equals the expected value
of Y given $\tau$, for all $\tau$.

To show that g(X) in Equation 67 satisfies first-order equity, it is
sufficient to show that it equals Equation 71. From Equations 69 and 70,

\[ \mu(T_x) = \lambda_x \mu(T) + \delta_x , \quad \mu(T_y) = \lambda_y \mu(T) + \delta_y , \]
\[ \sigma(T_x) = \lambda_x \sigma(T) , \quad \text{and} \quad \sigma(T_y) = \lambda_y \sigma(T) . \]

Replacing these equations in Equation 67 gives

\begin{align*}
g(X) &= \frac{\lambda_y}{\lambda_x} [X - \lambda_x \mu(T) - \delta_x] + \lambda_y \mu(T) + \delta_y \\
&= \frac{\lambda_y}{\lambda_x} (X - \delta_x) + \delta_y ,
\end{align*}

which is identical to Equation 71.
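
The single-group argument can be verified with a small numerical check. This is a sketch under stated assumptions: the function names `g_eq67` and `g_eq71` and every parameter value are hypothetical. The first function builds Equation 67 from the true-score means and standard deviations implied by Equations 69 and 70; the second is Equation 71.

```python
def g_eq67(x, lam_x, delta_x, lam_y, delta_y, mu_t, sigma_t):
    """Equation 67 with moments implied by the congeneric model (Eqs. 69-70)."""
    mu_tx = lam_x * mu_t + delta_x
    mu_ty = lam_y * mu_t + delta_y
    slope = (lam_y * sigma_t) / (lam_x * sigma_t)   # sigma(T_y) / sigma(T_x)
    return slope * (x - mu_tx) + mu_ty


def g_eq71(x, lam_x, delta_x, lam_y, delta_y):
    """Equation 71: the true-score relationship psi applied to observed X."""
    return (lam_y / lam_x) * (x - delta_x) + delta_y


# Hypothetical congeneric parameters.
params = dict(lam_x=1.2, delta_x=2.0, lam_y=0.9, delta_y=-1.0)
print(g_eq67(27.0, mu_t=20.0, sigma_t=5.0, **params))
print(g_eq71(27.0, **params))

# First-order equity: since g is linear and E(X | T = t) = lam_x * t + delta_x,
# E[g(X) | T = t] equals g evaluated at that conditional mean.
t = 18.0
print(g_eq71(params["lam_x"] * t + params["delta_x"], **params),
      params["lam_y"] * t + params["delta_y"])
```

The two functions agree for any x, and the conditional mean of g(X) matches the conditional mean of Y, which is the first-order equity condition in Equation 68 for this simple design.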

The same type of logic has been applied by Hanson (1990) in the much more
complicated context of the common-item nonequivalent-populations design.
Specifically, except for slight notational differences, Hanson (1990) uses the
following congeneric model for the test forms and common items:

\[ H_2 = T_2 + E_{h_2} , \tag{72} \]
\[ K_1 = (\lambda_k T_1 + \delta_k) + E_{k_1} , \quad \text{and} \tag{73} \]
\[ V_i = (\lambda_v T_i + \delta_v) + E_{v_i} , \quad i = 1, 2 , \tag{74, 75} \]

where 1 and 2 designate populations, $T_i$ (i = 1,2) is the true-score random
variable corresponding to the observable score $H_i$, Y = H + V, and
X = K + V. Further, the error variances are assumed to satisfy the assumptions
of the classical congeneric model, and the constraints imposed are
$\lambda_h = 1$ and $\delta_h = 0$. Given this modeling, Hanson (1990) shows
that Levine's true-score equating procedure satisfies first-order equity, for
both internal and external sets of common items.

Note that Hanson's modeling of congeneric forms in Equations 72-75
differs considerably from that discussed in previous sections of this paper.
In particular, Equations 72-75 directly relate V to both X and Y in a single
model. In other words, Equations 72-75 constitute one model with one
$\lambda_v$ term for V, whereas in previous sections the common-item
nonequivalent-populations design was framed in terms of separate congeneric
models for the two forms, which involves two effective test length parameters
for V.

Summary and Discussion 
In this paper, two different parameterizations of a classical congeneric
model have been introduced to model explicitly the variables in the Levine
observed-score and true-score linear equating procedures, for the external and
internal anchor cases. The models differ in the constraints imposed on the
effective test length parameters, $\lambda_x$ and $\lambda_v$, as well as
assumptions made about one covariance term, $\sigma(E_x, E_v)$. With an
external anchor the model employs the constraint $\lambda_x + \lambda_v = 1$
and assumes $\sigma(E_x, E_v) = 0$, whereas with an internal anchor
$\lambda_x$ is set to 1, and it is assumed that
$\sigma(E_x, E_v) = \lambda_v \sigma^2(E)$. Using these two parameterizations,
relatively simple expressions are easily obtained for true-score variances,
reliabilities, and error variances. Further, the error variances are equal to
the so-called "Angoff error variances."

Applying these two parameterizations of the classical congeneric model
with the Levine assumptions leads to general equations (for both of Levine's
procedures and both the external and internal anchor cases) that involve
ratios of effective test length parameters. This aids interpretation.

The derived results are summarized in Table 1, where $\ell(X)$ and g(X) are
the linear functions for the observed-score and true-score methods,
respectively. There are similarities between the expression of some of the
results in Table 1 and other expressions of results for the Levine procedures
(notably, Angoff, 1984, p. 115, and Kolen & Brennan, 1987, p. 272). For
example, as in Kolen and Brennan (1987), results are expressed in terms of
synthetic population weights, means and variances that are directly
observable, and certain $\gamma$-terms. (Kolen & Brennan, however, provide
results for the observed-score case only.) Also, for $w_1 = n_1/(n_1 + n_2)$
and $w_2 = n_2/(n_1 + n_2)$ the results in Table 1 are algebraically
equivalent to those presented by Angoff (1984).

There are, however, several differences between the expression of results
summarized in Table 1 and other expressions. First, the $\gamma$-terms are all
expressed as ratios of effective test length parameters for the two
parameterizations of the classical congeneric model used in this paper. This
fact enhances the interpretability of the $\gamma$-terms. For example, with an
external anchor, it is evident that $\gamma_1$ increases as the effective test
length of X increases relative to V in population 1. Second, the $\gamma$-terms are
the same for both the observed-score and true-score methods. Third, the
effective test length parameters are all slopes in a particular linear
regression. In general $\lambda_{F_i} = \alpha_i(F \mid *)$, where $i = 1$ or 2, $F$ is X, Y, or V, and $*$
is a total score involving $F$. Fourth, the linear function for the true-score
method, $g(X)$, can be obtained using expressions for synthetic group means and
variances that involve synthetic population weights, but $g(X)$ itself is blind
to such weights. This is a notable difference between the observed-score and
true-score methods, a difference that has not been reported previously.
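
The fourth point can be checked numerically. The Python sketch below is not part of the original report; it simply codes the Table 1 expressions for the synthetic means and variances, $\ell(X)$, and $g(X)$, taking the $\gamma$-terms as given inputs. The class name `GroupStats`, the function names, and all summary statistics are hypothetical choices made for illustration, and the true-score slope is taken as $\gamma_2/\gamma_1$, which assumes both $\gamma$-terms are positive.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class GroupStats:
    """Observed moments for one population (hypothetical container)."""
    mean_form: float  # mean of X in population 1, or of Y in population 2
    var_form: float   # variance of X (population 1) or Y (population 2)
    mean_v: float     # mean of the anchor V
    var_v: float      # variance of the anchor V


def synthetic_moments(pop1, pop2, gamma1, gamma2, w1):
    """Synthetic-population means and variances of X and Y, as in Table 1."""
    w2 = 1.0 - w1
    d_mean = pop1.mean_v - pop2.mean_v
    d_var = pop1.var_v - pop2.var_v
    mu_x = pop1.mean_form - w2 * gamma1 * d_mean
    mu_y = pop2.mean_form + w1 * gamma2 * d_mean
    var_x = (pop1.var_form - w2 * gamma1**2 * d_var
             + w1 * w2 * gamma1**2 * d_mean**2)
    var_y = (pop2.var_form + w1 * gamma2**2 * d_var
             + w1 * w2 * gamma2**2 * d_mean**2)
    return mu_x, mu_y, var_x, var_y


def levine_observed(x, pop1, pop2, gamma1, gamma2, w1):
    """Observed-score function l(X): slope is sigma_s(Y) / sigma_s(X)."""
    mu_x, mu_y, var_x, var_y = synthetic_moments(pop1, pop2, gamma1, gamma2, w1)
    return sqrt(var_y / var_x) * (x - mu_x) + mu_y


def levine_true(x, pop1, pop2, gamma1, gamma2, w1):
    """True-score function g(X): slope is gamma2 / gamma1."""
    mu_x, mu_y, _, _ = synthetic_moments(pop1, pop2, gamma1, gamma2, w1)
    return (gamma2 / gamma1) * (x - mu_x) + mu_y


# Made-up summary statistics and gamma-terms, for illustration only.
pop1 = GroupStats(mean_form=50.0, var_form=100.0, mean_v=20.0, var_v=16.0)
pop2 = GroupStats(mean_form=52.0, var_form=110.0, mean_v=18.0, var_v=20.0)
for w1 in (1.0, 0.5, 0.0):
    print(w1,
          levine_observed(60.0, pop1, pop2, 2.0, 2.2, w1),
          levine_true(60.0, pop1, pop2, 2.0, 2.2, w1))
```

With these inputs the converted score from `levine_true` is the same for every `w1`, while the value from `levine_observed` shifts with the weights, which is the contrast described above.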

The assumptions about true scores and error variances for both of the
Levine methods are the same. What distinguishes the methods is the nature of
the linear functions. For the observed-score method, the linear function
relates converted observed scores on X to scores on Y. For the true-score
method, however, the basic linear function relates true scores, but it is
applied to observed scores. Consequently, for the true-score method, it is
not clear how the converted scores on X are in any sense comparable to scores
on Y. Recently, however, Hanson (1990) has shown that Levine's true-score
method satisfies the condition of first-order equity under a particular
parameterization of the classical congeneric model. Of course, this does not
necessarily mean that Levine's true-score method is preferable to Levine's
observed-score method, but Hanson's proof casts new light on Levine's
true-score method.

Although the two Levine methods are not properly distinguished in terms
of being derived under assumptions about equally reliable and unequally
reliable tests, there is a relationship between the two methods that involves
reliabilities. In particular, if there exists a particular synthetic
population in which X and Y are equally reliable [i.e., $\rho_s(X,X') = \rho_s(Y,Y')$
for a particular $w_1$ (and $w_2 = 1 - w_1$)], then

$$\frac{\sigma_s(T_y)}{\sigma_s(T_x)} = \frac{\sigma_s(Y)\sqrt{\rho_s(Y,Y')}}{\sigma_s(X)\sqrt{\rho_s(X,X')}} = \frac{\sigma_s(Y)}{\sigma_s(X)},$$

$\ell(X) = g(X)$ for this synthetic population, and for both methods the converted
scores on X for the synthetic population will have the same mean and variance
as the scores on Y. Note that this equivalence does not necessarily hold for
every synthetic population, however.

Sometimes the following question is asked: "When tests are equally
reliable, why doesn't Levine's unequally reliable procedure give the same
results as Levine's equally reliable procedure?" This seemingly sensible
question, however, is somewhat misleading and ambiguous. It is misleading
because, as shown in this report, the generalized version of the Levine-Angoff
results summarized in Table 1 can be derived without any assumptions about
reliability. The so-called "equally reliable" procedure is simply the
observed-score method denoted $\ell(X)$, and the so-called "unequally reliable"
procedure is simply the true-score method denoted $g(X)$. The question is
ambiguous because it fails to recognize the role of the synthetic population
in obtaining $\ell(X)$. For example, suppose $\rho_1(X,X') = \rho_2(Y,Y')$, which implies
that X and Y are equally reliable for the populations that actually took X and
Y. It does not follow, however, that $\rho_s(X,X') = \rho_s(Y,Y')$ for the particular
synthetic population actually used. Thus, it is quite possible for forms to
be equally reliable in some sense without having $\ell(X) = g(X)$.

Levine's (1955) methods make assumptions about true scores.
Consequently, to apply these methods, one must employ some model that relates
observed and true scores. Levine employed classical test theory assumptions,
and expressed many of his results in terms of reliability coefficients.
However, he gave only limited consideration to how such coefficients might be
estimated. Angoff's (1984) results are based on estimating these coefficients
using observed variances and Angoff's (1953) error variances. In this paper,
two specific classical congeneric models are used to derive results for
Levine's methods. These results can be viewed as more general versions of
Angoff's results, although they are derived and expressed differently.

Since Levine's methods require some model for the relationship between
observed and true scores, models other than the classical congeneric model
could lead to different results. In particular, the multi-factor congeneric
model discussed by Feldt and Brennan (1989, p. 111), or one or more models in
generalizability theory, might be employed with Levine's methods. The
principal point is that improved estimates of true-score variances, error
variances, or reliabilities could lead to improved results. Also,
improvements might result from relaxing one or more of Levine's assumptions.
Appendix

Proof that $\sigma_s^2(T_x) = \gamma_1^2 \sigma_s^2(T_v)$

In general, it is easy to show that

$$\sigma_s^2(T_x) = w_1 \sigma_1^2(T_x) + w_2 \sigma_2^2(T_x) + w_1 w_2 [\mu_1(T_x) - \mu_2(T_x)]^2 . \tag{A1}$$

For the classical congeneric model $\mu_1(T_x) = \mu_1(X)$ and $\mu_2(T_x) = \mu_2(X)$. Also,
under Levine's assumptions Kolen and Brennan (1987, Equation 32) show that

$$\mu_1(X) - \mu_2(X) = [\sigma_1(T_x)/\sigma_1(T_v)] [\mu_1(V) - \mu_2(V)] .$$

It follows that

$$\sigma_s^2(T_x) = w_1 \sigma_1^2(T_x) + w_2 \sigma_2^2(T_x) + w_1 w_2 \frac{\sigma_1^2(T_x)}{\sigma_1^2(T_v)} [\mu_1(V) - \mu_2(V)]^2$$

$$= \frac{\sigma_1^2(T_x)}{\sigma_1^2(T_v)} \left\{ w_1 \sigma_1^2(T_v) + w_2 \sigma_2^2(T_x) \frac{\sigma_1^2(T_v)}{\sigma_1^2(T_x)} + w_1 w_2 [\mu_1(V) - \mu_2(V)]^2 \right\} . \tag{A2}$$

Under the Levine assumptions, the slope of the linear function of $T_x$ on $T_v$ is
the same in populations 1 and 2. This means that

$$\sigma_1(T_x)/\sigma_1(T_v) = \sigma_2(T_x)/\sigma_2(T_v) . \tag{A3}$$

Applying Equation A3 to the second term in braces in Equation A2 gives

$$\sigma_s^2(T_x) = \frac{\sigma_1^2(T_x)}{\sigma_1^2(T_v)} \left\{ w_1 \sigma_1^2(T_v) + w_2 \sigma_2^2(T_v) + w_1 w_2 [\mu_1(V) - \mu_2(V)]^2 \right\} .$$

The term in braces is $\sigma_s^2(T_v)$, and by Equation 42 $\sigma_1^2(T_x)/\sigma_1^2(T_v) = \gamma_1^2$. Thus

$$\sigma_s^2(T_x) = \gamma_1^2 \sigma_s^2(T_v) ,$$

as was to be proved.
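
As an informal check on this proof (not part of the original appendix), the identity can be verified numerically. The Python sketch below uses made-up values; the function name `synthetic_variance` and every number are hypothetical. The Levine assumptions are imposed by construction: the true-score standard deviations of X and V stand in the same ratio $\gamma_1$ in both populations, and the mean difference on X equals $\gamma_1$ times the mean difference on V.

```python
def synthetic_variance(w1, var1, var2, mean1, mean2):
    """Variance in the synthetic population, following the form of Equation A1."""
    w2 = 1.0 - w1
    return w1 * var1 + w2 * var2 + w1 * w2 * (mean1 - mean2) ** 2


# Hypothetical inputs chosen only for illustration.
gamma1 = 1.8
w1 = 0.3
var1_tv, var2_tv = 9.0, 12.25      # true-score variances of V in populations 1 and 2
mean1_v, mean2_v = 20.0, 17.5      # means of V (equal to the means of T_v)

# Levine assumptions: same slope in both populations (Equation A3) ...
var1_tx = gamma1**2 * var1_tv
var2_tx = gamma1**2 * var2_tv
# ... and mean differences on X proportional to those on V.
mean1_x = 40.0
mean2_x = mean1_x - gamma1 * (mean1_v - mean2_v)

lhs = synthetic_variance(w1, var1_tx, var2_tx, mean1_x, mean2_x)
rhs = gamma1**2 * synthetic_variance(w1, var1_tv, var2_tv, mean1_v, mean2_v)
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding for any choice of `w1` between 0 and 1, as the proof implies.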
Table 1

Equations for Levine's Observed-Score [$\ell(X)$]
and True-Score [$g(X)$] Methods

$$\ell(X) = [\sigma_s(Y)/\sigma_s(X)] [X - \mu_s(X)] + \mu_s(Y)$$

$$g(X) = [\sigma_s(T_y)/\sigma_s(T_x)] [X - \mu_s(X)] + \mu_s(Y)$$

$$\phantom{g(X)} = (\gamma_2/\gamma_1) [X - \mu_1(X)] + \mu_2(Y) + \gamma_2 [\mu_1(V) - \mu_2(V)]$$

$$\mu_s(X) = \mu_1(X) - w_2 \gamma_1 [\mu_1(V) - \mu_2(V)]$$

$$\mu_s(Y) = \mu_2(Y) + w_1 \gamma_2 [\mu_1(V) - \mu_2(V)]$$

$$\sigma_s^2(X) = \sigma_1^2(X) - w_2 \gamma_1^2 [\sigma_1^2(V) - \sigma_2^2(V)] + w_1 w_2 \gamma_1^2 [\mu_1(V) - \mu_2(V)]^2$$

$$\sigma_s^2(Y) = \sigma_2^2(Y) + w_1 \gamma_2^2 [\sigma_1^2(V) - \sigma_2^2(V)] + w_1 w_2 \gamma_2^2 [\mu_1(V) - \mu_2(V)]^2$$

$$\sigma_s^2(T_x) = \gamma_1^2 \sigma_s^2(T_v)$$

$$\sigma_s^2(T_y) = \gamma_2^2 \sigma_s^2(T_v) ,$$

where $\sigma_s^2(T_v) = w_1 \sigma_1^2(T_v) + w_2 \sigma_2^2(T_v) + w_1 w_2 [\mu_1(V) - \mu_2(V)]^2$

External Anchor ($A = X + V$, $B = Y + V$) (Classical congeneric model)

$$\gamma_1 = \frac{\lambda_{x_1}}{\lambda_{v_1}} = \frac{\alpha_1(X \mid A)}{\alpha_1(V \mid A)} = \frac{\sigma_1^2(X) + \sigma_1(X,V)}{\sigma_1^2(V) + \sigma_1(X,V)}$$

$$\gamma_2 = \frac{\lambda_{y_2}}{\lambda_{v_2}} = \frac{\alpha_2(Y \mid B)}{\alpha_2(V \mid B)} = \frac{\sigma_2^2(Y) + \sigma_2(Y,V)}{\sigma_2^2(V) + \sigma_2(Y,V)}$$

Internal Anchor (Classical congeneric model)

$$\gamma_1 = 1/\lambda_{v_1} = 1/\alpha_1(V \mid X) = \sigma_1^2(X)/\sigma_1(X,V)$$

$$\gamma_2 = 1/\lambda_{v_2} = 1/\alpha_2(V \mid Y) = \sigma_2^2(Y)/\sigma_2(Y,V)$$

Note. For Tucker's method use $\gamma_1 = \alpha_1(X \mid V)$ and $\gamma_2 = \alpha_2(Y \mid V)$ in
$\ell(X)$ for both the internal and external anchor cases.
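
The $\gamma$-terms in Table 1 depend only on observed variances and covariances, so they are straightforward to compute. The Python sketch below is an illustration rather than part of the original table; the function names and the numeric inputs are hypothetical, and each $\alpha(F \mid *)$ is read as the ordinary regression slope of $F$ on $*$, that is, the covariance of $F$ and $*$ divided by the variance of $*$.

```python
def gamma_external(var_form, var_v, cov_form_v):
    """External anchor: [var(form) + cov(form, V)] / [var(V) + cov(form, V)]."""
    return (var_form + cov_form_v) / (var_v + cov_form_v)


def gamma_internal(var_form, cov_form_v):
    """Internal anchor: var(form) / cov(form, V)."""
    return var_form / cov_form_v


def gamma_tucker(var_v, cov_form_v):
    """Tucker's method: slope of the form score on V, cov(form, V) / var(V)."""
    return cov_form_v / var_v


def slope_ratio_external(var_form, var_v, cov_form_v):
    """Ratio alpha(form | A) / alpha(V | A) for the total score A = form + V."""
    var_total = var_form + var_v + 2.0 * cov_form_v
    alpha_form = (var_form + cov_form_v) / var_total  # cov(form, A) / var(A)
    alpha_v = (var_v + cov_form_v) / var_total        # cov(V, A) / var(A)
    return alpha_form / alpha_v


# Made-up population 1 statistics, for illustration only.
var_x, var_v, cov_xv = 100.0, 16.0, 24.0
print(gamma_external(var_x, var_v, cov_xv))
print(slope_ratio_external(var_x, var_v, cov_xv))
print(gamma_tucker(var_v, cov_xv))
print(gamma_internal(var_x, 32.0))
```

The first two printed values coincide, which illustrates why the external-anchor $\gamma$-term can be written either as a ratio of regression slopes or directly in terms of variances and covariances: the variance of the total score cancels.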